

Disentangling by Subspace Diffusion

Neural Information Processing Systems

We present a novel nonparametric algorithm for symmetry-based disentangling of data manifolds, the Geometric Manifold Component Estimator (GEOMANCER). GEOMANCER provides a partial answer to the question posed by Higgins et al. (2018): is it possible to learn how to factorize a Lie group solely from observations of the orbit of an object it acts on? We show that fully unsupervised factorization of a data manifold is possible if the true metric of the manifold is known and each factor manifold has nontrivial holonomy -- for example, rotation in 3D. Our algorithm works by estimating the subspaces that are invariant under random walk diffusion, giving an approximation to the de Rham decomposition from differential geometry. We demonstrate the efficacy of GEOMANCER on several complex synthetic manifolds. Our work reduces the question of whether unsupervised disentangling is possible to the question of whether unsupervised metric learning is possible, providing a unifying insight into the geometric nature of representation learning.
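The de Rham decomposition that GEOMANCER approximates can be illustrated on a toy product manifold. The sketch below (a hedged illustration assuming NumPy, not the authors' code) samples a flat torus S^1 x S^1 embedded in R^4, estimates a tangent space by local PCA, and checks that the tangent projector is block-diagonal across the two circle factors -- exactly the factor structure a disentangling algorithm would need to recover.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample a flat torus S^1 x S^1 embedded in R^4: coordinates (0, 1) trace
# the first circle factor, coordinates (2, 3) the second.
theta = rng.uniform(0.0, 2.0 * np.pi, size=(500, 2))
X = np.stack([np.cos(theta[:, 0]), np.sin(theta[:, 0]),
              np.cos(theta[:, 1]), np.sin(theta[:, 1])], axis=1)

# Estimate the tangent space at point 0 by local PCA over its neighbors.
dists = np.linalg.norm(X - X[0], axis=1)
nbrs = X[np.argsort(dists)[1:31]] - X[0]
_, _, Vt = np.linalg.svd(nbrs, full_matrices=False)
T = Vt[:2]                      # basis of the estimated 2-D tangent space

# On a product manifold the tangent projector splits across the factors:
# one block for coordinates (0, 1), one for (2, 3), near-zero in between.
P = T.T @ T
off_block = np.linalg.norm(P[:2, 2:])
print(bool(off_block < 0.5))
```

Local PCA alone cannot tell *which* directions inside the tangent plane belong to which factor -- that is the harder problem GEOMANCER solves with diffusion -- but the block structure of the projector already reveals that a factorization exists.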


Review for NeurIPS paper: Disentangling by Subspace Diffusion

Neural Information Processing Systems

Strengths: This paper provides new insights into the problem of disentangling independent latent factors, viewed here through the lens of factorizing groups of transformations on a data manifold. The authors base their construction on the de Rham decomposition, which itself is based on the holonomy group arising from parallel transport over loops on a manifold. Essentially, the authors seek to extract multiple representations of input data, such that each of them encodes a submanifold whose holonomy group is independent of all other submanifolds. This lends a much-needed formalism to an important problem that is often ill defined, with mostly heuristic, qualitative goals that depend on specific applications rather than being studied with rigor. The construction here extends the work of Singer and Wu on vector diffusion maps, which enriches more traditional manifold learning by encoding information about tangent spaces and the operation of the connection Laplacian on tangent vector fields.
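The holonomy group the review refers to can be made concrete with a small numerical experiment (a hedged NumPy sketch, unrelated to the paper's implementation): parallel-transporting a tangent vector around a latitude circle on the sphere rotates it by the enclosed solid angle, 2*pi*(1 - cos(theta)) -- a nontrivial holonomy of exactly the kind the paper requires of each factor manifold.

```python
import numpy as np

# Numerically parallel-transport a tangent vector around a latitude circle
# on S^2 by repeatedly projecting onto the new tangent plane (this discrete
# scheme converges to Levi-Civita parallel transport as the step size -> 0).
theta = np.pi / 3               # colatitude of the loop
n_steps = 20000
phis = np.linspace(0.0, 2.0 * np.pi, n_steps)

def point(phi):
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

v0 = np.array([np.cos(theta), 0.0, -np.sin(theta)])  # unit tangent at phi = 0
v = v0.copy()
for i in range(1, n_steps):
    p = point(phis[i])
    v = v - np.dot(v, p) * p    # project onto the tangent plane at p
    v /= np.linalg.norm(v)      # transport preserves length

# The transported vector returns rotated by the enclosed solid angle.
angle = np.arccos(np.clip(np.dot(v, v0), -1.0, 1.0))
expected = 2.0 * np.pi * (1.0 - np.cos(theta))       # = pi for theta = pi/3
print(round(angle, 3), round(expected % (2.0 * np.pi), 3))
```

For theta = pi/3 the enclosed solid angle is pi, so the vector comes back pointing the opposite way -- a loop-dependent rotation that no flat (zero-holonomy) factor could produce.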


Review for NeurIPS paper: Disentangling by Subspace Diffusion

Neural Information Processing Systems

The paper reduces the question of whether unsupervised disentangling is possible to the question of whether unsupervised metric learning is possible, providing a unifying insight into the geometric nature of representation learning. All reviewers agree that the theory and algorithm developed for decomposing Lie groups are novel. The paper is missing citations of previous work on fibre bundles and manifold learning, which the authors should add in the revised version.


Subspace Diffusion Generative Models

Jing, Bowen, Corso, Gabriele, Berlinghieri, Renato, Jaakkola, Tommi

arXiv.org Artificial Intelligence

Score-based models generate samples by mapping noise to data (and vice versa) via a high-dimensional diffusion process. We question whether it is necessary to run this entire process at high dimensionality and incur all the inconveniences thereof. Instead, we restrict the diffusion via projections onto subspaces as the data distribution evolves toward noise. When applied to state-of-the-art models, our framework simultaneously improves sample quality -- reaching an FID of 2.17 on unconditional CIFAR-10 -- and reduces the computational cost of inference for the same number of denoising steps. Our framework is fully compatible with continuous-time diffusion and retains its flexible capabilities, including exact log-likelihoods and controllable generation. Code is available at https://github.com/bjing2016/subspace-diffusion.
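The core idea -- that at high noise levels the off-subspace component of the signal is negligible relative to the injected noise, so the diffusion can be restricted to a subspace -- can be sketched in a toy 2-D setting. This is an illustrative sketch assuming NumPy and a VP-SDE marginal, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data concentrated near a 1-D subspace of R^2 (the x-axis).
x0 = np.stack([rng.normal(0.0, 1.0, 2000),
               rng.normal(0.0, 0.05, 2000)], axis=1)

def forward(x0, t, beta=5.0):
    """VP-SDE marginal: x_t = a(t) * x_0 + sqrt(1 - a(t)^2) * eps."""
    a = np.exp(-0.5 * beta * t)
    return a * x0 + np.sqrt(1.0 - a**2) * rng.normal(size=x0.shape), a

# At large t the signal has shrunk, so the off-subspace coordinate is
# dominated by isotropic noise and can be projected away with little loss.
xt, a = forward(x0, t=1.0)
xt_proj = xt.copy()
xt_proj[:, 1] = 0.0                          # restrict diffusion to the subspace
signal_lost = a * np.abs(x0[:, 1]).mean()    # signal discarded by the projection
noise_level = np.sqrt(1.0 - a**2)            # noise std in the dropped coordinate
print(signal_lost < 0.05 * noise_level)      # -> True: projection is nearly lossless
```

Running the late, noisy part of the process in the lower-dimensional subspace is what saves inference compute while leaving sample statistics essentially unchanged.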


Dimensionality-Varying Diffusion Process

Zhang, Han, Feng, Ruili, Yang, Zhantao, Huang, Lianghua, Liu, Yu, Zhang, Yifei, Shen, Yujun, Zhao, Deli, Zhou, Jingren, Cheng, Fan

arXiv.org Artificial Intelligence

Diffusion models, which learn to reverse a signal-destruction process to generate new data, typically require the signal at each step to have the same dimension. We argue that, given the spatial redundancy in image signals, there is no need to maintain high dimensionality throughout the evolution process, especially in the early generation phase. To this end, we theoretically generalize the forward diffusion process via signal decomposition. Concretely, we decompose an image into multiple orthogonal components and control the attenuation of each component when perturbing the image. That way, as the noise strength increases, we can diminish the inconsequential components and represent the source with a lower-dimensional signal while losing almost no information. This reformulation makes it possible to vary the dimensionality in both training and inference of diffusion models. Extensive experiments on a range of datasets show that our approach substantially reduces the computational cost and achieves on-par or even better synthesis performance compared to baseline methods. We also show that our strategy facilitates high-resolution image synthesis and improves the FID of a diffusion model trained on FFHQ at $1024\times1024$ resolution from 52.40 to 10.46. Code and models will be made publicly available.
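The kind of orthogonal decomposition the abstract describes can be sketched with block means as the low-dimensional component and the within-block residual as the component to attenuate. This is a toy NumPy illustration of the principle; the paper's actual decomposition may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((8, 8))

# Orthogonal decomposition: 2x2 block means (the low-dim component) + residual.
low = img.reshape(4, 2, 4, 2).mean(axis=(1, 3))           # 4x4 representation
low_up = np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)  # upsampled to 8x8
res = img - low_up

# The components are exactly orthogonal, so the split is lossless.
assert np.allclose(low_up + res, img)
assert abs(np.sum(low_up * res)) < 1e-9

# Attenuate the residual as the noise strength grows; at full attenuation the
# surviving signal is exactly the 4x4 low-dimensional representation.
def perturbed_signal(t):        # t in [0, 1]: attenuation of the residual
    return low_up + (1.0 - t) * res

assert np.allclose(perturbed_signal(1.0), low_up)
```

Once the residual is fully attenuated, the reverse process can run on the 4x4 signal and only return to full resolution for the final, low-noise steps, which is where the computational savings come from.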